 device scheduling


Optimizing Value of Learning in Task-Oriented Federated Meta-Learning Systems

arXiv.org Artificial Intelligence

Federated Learning (FL) has gained significant attention in recent years due to its distributed nature and privacy-preserving benefits. However, a key limitation of conventional FL is that it learns and distributes a common global model to all participants, which fails to provide customized solutions for diverse task requirements. Federated meta-learning (FML) offers a promising solution to this issue by enabling devices to fine-tune local models after receiving a shared meta-model from the server. In this paper, we propose a task-oriented FML framework over non-orthogonal multiple access (NOMA) networks. A novel metric, termed value of learning (VoL), is introduced to assess the individual training needs across devices. Moreover, a task-level weight (TLW) metric is defined based on task requirements and fairness considerations, guiding the prioritization of edge devices during FML training. The formulated problem, which maximizes the sum of TLW-based VoL across devices, is a non-convex mixed-integer non-linear programming (MINLP) problem; we solve it with a parameterized deep Q-network (PDQN) algorithm that handles both discrete and continuous variables. Simulation results demonstrate that our approach significantly outperforms baseline schemes, underscoring the advantages of the proposed framework.


Device Scheduling and Assignment in Hierarchical Federated Learning for Internet of Things

arXiv.org Artificial Intelligence

Federated Learning (FL) is a promising machine learning approach for the Internet of Things (IoT), but it has to address network congestion problems when the population of IoT devices grows. Hierarchical FL (HFL) alleviates this issue by distributing model aggregation to multiple edge servers. Nevertheless, the challenge of communication overhead remains, especially in scenarios where all IoT devices simultaneously join the training process. For scalability, practical HFL schemes select a subset of IoT devices to participate in the training, hence the notion of device scheduling. In this setting, only selected IoT devices are scheduled to participate in the global training, with each of them being assigned to one edge server. Existing HFL assignment methods are primarily based on search mechanisms, which suffer from high latency in finding the optimal assignment. This paper proposes an improved K-Center algorithm for device scheduling and introduces a deep reinforcement learning-based approach for assigning IoT devices to edge servers. Experiments show that scheduling 50% of IoT devices is generally adequate for achieving convergence in HFL with much lower time delay and energy consumption. In cases where reduction in energy consumption (such as in Green AI) and reduction of messages (to avoid burst traffic) are key objectives, scheduling 30% of IoT devices allows a substantial reduction in energy and messages with similar model accuracy.
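
As a rough illustration of K-Center-style device scheduling, the sketch below implements the textbook greedy (farthest-first) K-Center heuristic. It is not the improved variant proposed in the paper, whose details are not given in the abstract. The function name `greedy_k_center`, the 2-D device descriptors, and the choice of the first center are all assumptions made for this example.

```python
import math

def greedy_k_center(points, k, first=0):
    """Farthest-first traversal: return k indices whose points cover the set well."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [first]
    # dist[i] is the distance from point i to its nearest chosen center so far.
    dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < k:
        # Pick the point that is currently worst covered.
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen

# Hypothetical 2-D descriptors, one per IoT device (illustrative values only).
devices = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (0.0, 0.2)]

# Schedule half of the devices, mirroring the 50% setting mentioned above.
scheduled = greedy_k_center(devices, k=len(devices) // 2)
print(scheduled)
```

The heuristic spreads the scheduled devices across the descriptor space, so near-duplicate devices are unlikely to be selected together.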


Channel and Gradient-Importance Aware Device Scheduling for Over-the-Air Federated Learning

arXiv.org Artificial Intelligence

Federated learning (FL) is a popular privacy-preserving distributed training scheme, where multiple devices collaborate to train machine learning models by uploading local model updates. To improve communication efficiency, over-the-air computation (AirComp) has been applied to FL, which leverages analog modulation to harness the superposition property of radio waves such that numerous devices can upload their model updates concurrently for aggregation. However, the uplink channel noise incurs considerable model aggregation distortion, which is critically determined by the device scheduling and compromises the learned model performance. In this paper, we propose a probabilistic device scheduling framework for over-the-air FL, named PO-FL, to mitigate the negative impact of channel noise, where each device is scheduled according to a certain probability and its model update is reweighted using this probability in aggregation. We prove the unbiasedness of this aggregation scheme and demonstrate the convergence of PO-FL on both convex and non-convex loss functions. Our convergence bounds unveil that the device scheduling affects the learning performance through the communication distortion and global update variance. Based on the convergence analysis, we further develop a channel and gradient-importance aware algorithm to optimize the device scheduling probabilities in PO-FL. Extensive simulation results show that the proposed PO-FL framework with channel and gradient-importance awareness achieves faster convergence and produces better models than baseline methods.
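
The reweighting idea can be illustrated with a generic inverse-probability estimator: if device i is scheduled with probability p_i and its update is divided by p_i, the expected aggregate equals the full average. The sketch below checks this numerically. It ignores channel noise and analog transmission entirely, and the names `sample_and_aggregate`, `updates`, and `probs` are hypothetical, not taken from PO-FL.

```python
import random

def sample_and_aggregate(updates, probs, rng):
    """Schedule device i with probability probs[i]; reweight its update by 1/probs[i]."""
    n = len(updates)
    dim = len(updates[0])
    agg = [0.0] * dim
    for g, p in zip(updates, probs):
        if rng.random() < p:  # device is scheduled in this round
            for j in range(dim):
                agg[j] += g[j] / (p * n)
    return agg

# Illustrative local updates and scheduling probabilities (assumed values).
updates = [[1.0, 2.0], [3.0, -1.0], [-2.0, 4.0]]
probs = [0.9, 0.5, 0.2]

rng = random.Random(0)
trials = 100_000
mean = [0.0, 0.0]
for _ in range(trials):
    a = sample_and_aggregate(updates, probs, rng)
    mean = [m + x / trials for m, x in zip(mean, a)]

# Full-participation average that the estimator should match in expectation.
true_avg = [sum(col) / len(updates) for col in zip(*updates)]
print(mean, true_avg)
```

Small scheduling probabilities keep the estimator unbiased but inflate its variance, which is one way to see why the choice of probabilities matters for convergence.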


Asynchronous Multi-Model Dynamic Federated Learning over Wireless Networks: Theory, Modeling, and Optimization

arXiv.org Artificial Intelligence

Federated learning (FL) has emerged as a key technique for distributed machine learning (ML). Most literature on FL has focused on ML model training for (i) a single task/model, with (ii) a synchronous scheme for uplink/downlink transfer of model parameters, and (iii) a static data distribution setting across devices. These assumptions are often not representative of conditions encountered in practical FL environments. To address this, we develop DMA-FL, which considers dynamic FL with multiple downstream tasks to be trained over an asynchronous model transmission architecture. We first characterize the convergence of ML model training under DMA-FL by introducing a family of scheduling tensors and rectangular functions to capture the scheduling of devices. Our convergence analysis sheds light on the impact of resource allocation, device scheduling, and individual model states on the performance of ML models. We then formulate a non-convex mixed integer optimization problem for jointly configuring the resource allocation and device scheduling to strike an efficient trade-off between energy consumption and ML performance. We develop a solution methodology employing successive convex approximations with a convergence guarantee to a stationary point. Through numerical simulations, we reveal the advantages of DMA-FL in terms of model performance and network resource savings.


Over-the-Air Federated Averaging with Limited Power and Privacy Budgets

arXiv.org Artificial Intelligence

To jointly overcome the communication bottleneck and privacy leakage of wireless federated learning (FL), this paper studies a differentially private over-the-air federated averaging (DP-OTA-FedAvg) system with a limited sum power budget. With DP-OTA-FedAvg, the gradients are aligned by an alignment coefficient and aggregated over the air, and channel noise is employed to protect privacy. We aim to improve the learning performance by jointly designing the device scheduling, alignment coefficient, and the number of aggregation rounds of federated averaging (FedAvg) subject to sum power and privacy constraints. We first present the privacy analysis based on differential privacy (DP) to quantify the impact of the alignment coefficient on privacy preservation in each communication round. Furthermore, to study how the device scheduling, alignment coefficient, and the number of global aggregation rounds affect the learning process, we conduct the convergence analysis of DP-OTA-FedAvg in the cases of convex and non-convex loss functions. Based on these analytical results, we formulate an optimization problem to minimize the optimality gap of DP-OTA-FedAvg subject to limited sum power and privacy budgets. The problem is solved by decoupling it into two sub-problems. Given the number of communication rounds, we derive the relationship between the number of scheduled devices and the alignment coefficient, which offers a set of potential optimal solution pairs of device scheduling and the alignment coefficient. Thanks to the reduced search space, the optimal solution can be efficiently obtained. The effectiveness of the proposed policy is validated through simulations.


Efficient Wireless Federated Learning with Partial Model Aggregation

arXiv.org Artificial Intelligence

Data heterogeneity across devices and limited communication resources, e.g., bandwidth and energy, are two of the main bottlenecks for wireless federated learning (FL). To tackle these challenges, we first devise a novel FL framework with partial model aggregation (PMA). This approach aggregates the lower layers of neural networks, responsible for feature extraction, at the parameter server while keeping the upper layers, responsible for complex pattern recognition, at devices for personalization. The proposed PMA-FL is able to address the data heterogeneity and reduce the transmitted information in wireless channels. Then, we derive a convergence bound of the framework under a non-convex loss function setting to reveal the role of unbalanced data size in the learning performance. On this basis, we maximize the scheduled data size to minimize the global loss function by jointly optimizing the device scheduling, bandwidth allocation, and computation and communication time division policies with the assistance of Lyapunov optimization. Our analysis reveals that the optimal time division is achieved when the communication and computation parts of PMA-FL have the same power. We also develop a bisection method to solve the optimal bandwidth allocation policy and use the set expansion algorithm to address the device scheduling policy. Compared with the benchmark schemes, the proposed PMA-FL improves accuracy by 3.13% and 11.8% on two typical datasets with heterogeneous data distribution settings, i.e., MNIST and CIFAR-10, respectively. In addition, the proposed joint dynamic device scheduling and resource management approach achieves only slightly higher accuracy than the considered benchmarks, but it provides a satisfactory energy and time reduction: 29% energy or 20% time reduction on MNIST, and 25% energy or 12.5% time reduction on CIFAR-10.
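
A minimal sketch of partial model aggregation, under stated assumptions: each local model is a dictionary mapping layer names to flat parameter lists, the shared (lower) layers are averaged with weights proportional to local data size in the FedAvg style, and all other layers stay on the device. The weighting rule, the layer names `conv` and `head`, and the function `partial_aggregate` are illustrative choices, not details confirmed by the abstract.

```python
def partial_aggregate(local_models, data_sizes, shared_layers):
    """Average only the shared layers; leave every other layer device-specific."""
    total = sum(data_sizes)
    global_part = {}
    for name in shared_layers:
        dim = len(local_models[0][name])
        # Data-size-weighted average of this layer across devices (assumed rule).
        global_part[name] = [
            sum(n * m[name][j] for m, n in zip(local_models, data_sizes)) / total
            for j in range(dim)
        ]
    # Each device receives the averaged shared layers and keeps its own upper layers.
    return [
        {**m, **{name: list(vals) for name, vals in global_part.items()}}
        for m in local_models
    ]

# Two hypothetical devices: "conv" is the shared feature extractor, "head" is personal.
models = [
    {"conv": [1.0, 2.0], "head": [10.0]},
    {"conv": [3.0, 6.0], "head": [20.0]},
]
sizes = [1, 3]  # local dataset sizes (illustrative)

updated = partial_aggregate(models, sizes, shared_layers=["conv"])
print(updated)
```

Only the shared layers would need to be uploaded in such a scheme, which is where the reduction in transmitted information comes from.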


Matching Pursuit Based Scheduling for Over-the-Air Federated Learning

arXiv.org Artificial Intelligence

This paper develops a class of low-complexity device scheduling algorithms for over-the-air federated learning via the method of matching pursuit. The proposed scheme closely tracks the close-to-optimal performance achieved by difference-of-convex programming, and significantly outperforms the well-known benchmark algorithms based on convex relaxation.

In the light of dramatically increasing numbers of mobile devices and data traffic in the Internet-of-Things era, the need for a paradigm shift in wireless networks from traditional centralized cloud computing architectures to distributed ones is growing [1]-[5]. By performing data processing at the edge of networks, several shortcomings of cloud computing, such as long latency and network congestion, can be effectively addressed [6]-[8]. Notably, edge computing is an appealing technology to perform real-time tasks and make real-time decisions by exploiting the abundant computational resources of the edge servers [9]-[11].

H. Vincent Poor is with the Department of Electrical and Computer Engineering at Princeton University; email: poor@princeton.edu.

One way of overcoming these challenges is to integrate the edge-intelligent network within wireless networks and leverage the superposition property of wireless multiple-access channels [15]. Recently, a new paradigm of distributed machine learning, referred to as federated learning (FL), has been introduced, in which distributed devices jointly train a shared global machine learning model without sharing their raw data explicitly [16]-[18]. In essence, FL is a collaborative machine learning framework that enables distributed model training from decentralized data under coordination of a parameter server (PS) [17]. In principle, FL is performed over a decentralized network as follows: 1) A PS first shares a global model with participating devices in the network.
It then transmits its trained model parameters to the PS while keeping its private data locally within its own device. These steps are alternated until the global model parameters converge [16], [18], [19]. Further illustrations can be found through the comprehensive example of FL given in Appendix A. Compared to the extreme cases of centralized and individual learning, FL provides a tractable approach to handle a joint learning task over a distributed network. Nevertheless, this tractability comes with some costs which can be roughly categorized into three major forms: 1) The statistical inference problem in FL is more challenging. This follows from the fact that the local datasets in the decentralized setting are not independent and identically distributed (i.i.d.).